SpellChecker - short Documentation for Users
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SpellChecker is a program for checking the spelling of words in
textfiles. It works without any problems on ASCII textfiles. If you
use it on files from word-processing programs, there may be problems
because of the control sequences these files contain. Another problem
can occur if your word processor saves its files with a constant line
length: if you then correct errors in such lines, the new line length
will differ from the old one, and the word processor may have trouble
reading the modified file. Test it with an old text first.
The basis of this program is a list (array) of correctly written
words. When SpellChecker checks a text, it reads a word from the text
and searches for it in the list. If the word is found, it is assumed
to be correct; otherwise it may be wrong, and the user gets the chance
to change it. But how do we get this list of correctly written words?
Well, it's easy. We join several textfiles (from different authors)
into one big textfile. (We can get such textfiles from PD docs or from
disk magazines.) Then we read the words from this textfile and sort
them into the list. If we read a word that is already in the list, we
increase a counter belonging to this word. So we get a list of words
whose counters show how often each word occurred. If the list is big
enough, we can delete all words with low counters; these words may be
wrong or extremely rare. The remaining words with high counters are
used often by different authors, so we assume that they are correct.
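
The counting idea can be sketched in a few lines of Python. This is only
an illustration, not the program's actual code: the function name
build_lexicon, the dictionary representation and the rule for what
counts as a "word" (a run of letters) are all assumptions.

```python
import re

def build_lexicon(text, lexicon=None):
    """Count every word in `text`; extend `lexicon` if one is given.

    Assumption: a word is a run of letters, and counting is
    case-sensitive ("This" and "this" are different entries).
    """
    lexicon = dict(lexicon or {})
    for word in re.findall(r"[^\W\d_]+", text):
        # A word already in the list only gets its counter increased.
        lexicon[word] = lexicon.get(word, 0) + 1
    return lexicon

lexicon = build_lexicon("the cat and the dog")
print(lexicon["the"], lexicon["cat"])
```

Passing an existing dictionary as the second argument extends it instead
of starting from scratch.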
That was the basic idea of this program; now I will explain how to use
it. Start it from WB by double-clicking its icon, or from the CLI with
[run] SpellChecker. The program needs the ARP library, the device T:
and some memory; I think 300 kByte of free RAM should be enough. Once
the program has started, you see a window with ten boolean gadgets and
a StringGadget. At this point the list of words (which I will
sometimes call the Lexikon) contains no words. To fill the Lexikon,
you can load it or generate it. Generating is only necessary if you
want to create a Lexikon for a new language. With this program I give
you two Lexikons, one with English and one with German words. For
other languages you have to generate your own.
Generating and expanding a Lexikon are the same procedure. First you
need a big textfile with the words you want to use for the Lexikon.
The safest method is to use ASCII files. (If you use files from your
word processor, you should make a small test first, see "ExportLex".)
Assume you have a big textfile named "BigTextFile" on a disk in drive
df1:. Now click on the gadget "ExpandLex". You will see the ARP file
requester. I think you know this requester, so I won't explain it. In
the requester, click on "Drives", then on "DF1:", then on
"BigTextFile" and finally on "OK". Now the requester vanishes, all
gadgets are ghosted and the mouse pointer goes to sleep, indicating
that the program is working and cannot react to your input. Now the
program generates a Lexikon. If the Lexikon was empty, a new Lexikon
is generated; otherwise the current Lexikon is expanded. Generating
or expanding takes some time. For example, I used a 600 kByte
textfile in ram: for this, and it took over 60 minutes to generate a
Lexikon containing 11000 words from this big file. With disk drives
it takes much longer, because the source text is read 4 times. Sorry
for the long wait, but of course you only have to wait once.
When generating is complete, the gadgets get their normal image back,
and in the text area you can read something like this:
Words: 9999 MinCount: 1 MaxCount 178
Now you have a Lexikon with 9999 words. You should save it by clicking
on "SaveLex". The ARP file requester appears. You can use the default
name "Lexikon" to save it, or change the name in the StringGadget.
Then click on "OK" to save it.
Well, now you have 9999 words, but some of these words may be
misspelled, or may be extremely rare words or names. To delete these,
click on the gadget "CleanLex". Every click deletes all words from
the Lexikon that have the lowest counter. The first click deletes the
words with counter=1, the next click the words with counter=2, and so
on. Don't worry if you have clicked too often, so that only a few
words are left in the Lexikon: you can load the original Lexikon back
from disk by clicking on "LoadLex".
For example, if MinCount=3, then every word in the Lexikon was found
three or more times in the text that you used to generate the Lexikon.
If MaxCount=56, no word in this text was found more than 56 times.
(MaxCount does not grow any further once it has reached 255.)
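
As a rough sketch of what one click on "CleanLex" does, and of how the
status line relates to the counters, consider the following Python
fragment. The names clean_lex and stats and the dictionary
representation are assumptions made for illustration; the real program
is not written this way.

```python
def stats(lexicon):
    """Format a status line like the one shown in the text area."""
    if not lexicon:
        return "Words: 0"
    counts = lexicon.values()
    return f"Words: {len(lexicon)} MinCount: {min(counts)} MaxCount {max(counts)}"

def clean_lex(lexicon):
    """One 'click': drop every word that shares the lowest counter.

    Returns the cleaned lexicon and the sorted list of removed words.
    """
    if not lexicon:
        return {}, []
    lowest = min(lexicon.values())
    removed = sorted(w for w, c in lexicon.items() if c == lowest)
    kept = {w: c for w, c in lexicon.items() if c > lowest}
    return kept, removed

lexicon = {"the": 5, "cat": 1, "dgo": 1, "dog": 2}
print(stats(lexicon))
lexicon, removed = clean_lex(lexicon)
print(stats(lexicon), removed)
```

Each call raises MinCount to the next counter value that is still
present in the list, which matches the behaviour described above.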
Now you have a Lexikon which you can load each time you use
SpellChecker. To check the spelling in a textfile, load the Lexikon
and then click on "CheckText". Now load the textfile in the same way
as you loaded the Lexikon. After loading, SpellChecker starts to
examine the text. It reads a word from the text and searches for it
in the Lexikon. If it finds the word, the word is assumed to be
correctly written and SpellChecker reads the next word. Otherwise, if
the word is not found in the Lexikon, it may be misspelled, and you,
the user, have to decide whether it is correct or not. You can
correct the word in the StringGadget. If you hit "Return" or click on
"Ignore", the word is corrected in the textfile but not added to the
Lexikon. If you click on "AddToLex", the word is corrected and added
to the Lexikon. SpellChecker distinguishes between upper and lower
case! SpellChecker knows that the first letter of a sentence has to
be upper case. If you click on "AddToLex", you have to pay attention
to the first letter of the word. Add words to the Lexikon only in the
form in which they are written in the MIDDLE OF A SENTENCE!!! For
example, in the sentence "This is a short sentence" the first letter
of the word "This" is upper case, and this is correct, but you should
NOT add the word to the Lexikon in this form, because the word "this"
is normally written with a lower-case first letter!
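
The case rule can be illustrated with a small Python sketch. It assumes
that a sentence ends at ".", "!" or "?" and that a capitalised word at
the start of a sentence is looked up with its first letter in lower
case; how SpellChecker really detects sentence starts is not documented
here, and the function names are made up.

```python
import re

def is_known(word, lexicon, sentence_start):
    """Case-sensitive lookup with a relaxed rule for sentence starts."""
    if word in lexicon:
        return True
    if sentence_start and word[:1].isupper():
        # "This" at the start of a sentence matches the entry "this".
        return (word[0].lower() + word[1:]) in lexicon
    return False

def find_unknown_words(text, lexicon):
    """Return the words the user would be asked about, in text order."""
    unknown = []
    sentence_start = True
    for token in re.findall(r"[^\W\d_]+|[.!?]", text):
        if token in ".!?":
            sentence_start = True   # next word begins a new sentence
            continue
        if not is_known(token, lexicon, sentence_start):
            unknown.append(token)
        sentence_start = False
    return unknown

lexicon = {"this", "is", "a", "short", "sentence"}
print(find_unknown_words("This is a short sentence. This is a shrt Sentence.", lexicon))
```

With this rule a Lexikon that stores "this" accepts both "this" in the
middle of a sentence and "This" at its start, while a Lexikon that
stores "This" would reject "this" everywhere. That is why words should
be added in their mid-sentence form.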
You can correct all words in this way, or cancel the operation by
clicking on the "WindowCloseGadget" or on "Quit". If you click on "Quit"
or the "WindowCloseGadget" before all words are corrected, the
textfile remains unchanged.
If you click on "DelWords", you can delete single words from the
Lexikon. For example, if you suspect that there is a misspelled word
in the Lexikon, click on "DelWords", type the word into the
StringGadget and press "RETURN" or click on "DeleteIt" to try to
delete it. If the word exists in the Lexikon, SpellChecker deletes
it; otherwise it displays the message "Word not found". To leave this
mode, click on "Quit" or the "WindowCloseGadget".
The last gadget is the export gadget, "ExportLex". With this gadget
you can export all words of the Lexikon to a textfile. To export
them, click on "ExportLex" and then enter a name for the textfile.
After exporting, you can use an editor to look at this file and
delete words you don't like. Then you can import the file again with
the gadget "ExpandLex".
Hint: If you use CleanLex, all deleted words are exported to the file
"T:CleanLex.txt". You can use this file in the same way as the export
file: for example, delete all wrong words with an editor and then
import the remaining words again by using "ExpandLex".
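
The export, edit and re-import cycle might look like this in Python.
The file layout (one word per line), the function names and the cap of
255 on the counters are assumptions for the sketch; the format that
SpellChecker actually writes is not described in this document.

```python
import os
import tempfile

MAX_COUNT = 255  # assumption, based on the remark that MaxCount stops at 255

def export_words(lexicon, path):
    """Write all words of the lexicon to a textfile, one per line."""
    with open(path, "w", encoding="utf-8") as f:
        for word in sorted(lexicon):
            f.write(word + "\n")

def expand_from_file(path, lexicon=None):
    """Read words from a textfile and add them to the lexicon."""
    lexicon = dict(lexicon or {})
    with open(path, encoding="utf-8") as f:
        for word in f.read().split():
            lexicon[word] = min(lexicon.get(word, 0) + 1, MAX_COUNT)
    return lexicon

# Export, then re-import into an empty lexicon (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "export.txt")
export_words({"dog": 2, "cat": 1}, path)
print(expand_from_file(path))
```

Note that in this sketch the counters are not stored in the export
file, so every re-imported word starts again with a counter of 1.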
Stefan Salewski, 16 March 1990